feat(runner): add git safety guardrails to system prompt #1360
jeremyeder merged 7 commits into ambient-code:main
Inject concise git safety rules into the session system prompt when repos are configured. Covers force push, ref deletion, main branch protection, destructive operations, and token exposure. Replaces the over-engineered approach from ambient-code#1225 (307-line regex module + 344-line test suite) with 15 lines of prompt text and 2 tests.

Closes ambient-code#1111

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
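The conditional injection described in this PR can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the PR: `build_prompt` is a hypothetical, much-simplified stand-in for the runner's `build_workspace_context_prompt`, the guardrail text is abbreviated to phrases quoted elsewhere in this PR, and the heading markup is a guess.

```python
# Hypothetical, simplified stand-in for the runner's prompt builder.
# The real GIT_SAFETY_INSTRUCTIONS text is longer; this is an abbreviation.
GIT_SAFETY_INSTRUCTIONS = (
    "## Git Safety Guardrails\n\n"  # heading level is an assumption
    "- **NEVER force push**\n"
    "- **NEVER embed tokens in commands**: use environment variables.\n\n"
    "When a git operation fails: stop, diagnose, report, wait for the user.\n\n"
)


def build_prompt(repos_cfg: list, workspace_path: str = "/workspace") -> str:
    """Assemble a system prompt, adding git rules only when repos exist."""
    prompt = f"# Workspace\n\nWorking directory: {workspace_path}\n\n"
    if repos_cfg:
        # Guardrails are only relevant when there is a repo to operate on.
        names = ", ".join(repo["name"] for repo in repos_cfg)
        prompt += f"Repositories: {names}\n\n"
        prompt += GIT_SAFETY_INSTRUCTIONS
    return prompt
```

Keeping the rules in a single module-level constant means the text can be reviewed and tested as one unit, which is the trade-off this PR makes against a regex-based enforcement module.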
Actionable comments posted: 2
    "5. **NEVER run destructive operations without a backup** — before "
    "`git reset --hard`, `git clean -fd`, or `git checkout -- .`, "
    "create a backup branch first.\n"
    "6. **NEVER embed tokens in commands** — use environment variables.\n\n"
    "When a git operation fails: stop, diagnose, report, wait for the user.\n\n"
)
Add explicit user-confirmation requirement for destructive local git operations
Lines 82-84 require a backup branch but still permit destructive commands without explicit consent. Given the incident goals, the prompt should require user confirmation before `git reset --hard`, `git clean -fd`, or `git checkout -- .`.
Suggested prompt update
GIT_SAFETY_INSTRUCTIONS = (
@@
- "5. **NEVER run destructive operations without a backup** — before "
- "`git reset --hard`, `git clean -fd`, or `git checkout -- .`, "
- "create a backup branch first.\n"
+ "5. **NEVER run destructive operations without explicit user approval** — "
+ "before `git reset --hard`, `git clean -fd`, or `git checkout -- .`, "
+ "ask the user for confirmation and create a backup branch first.\n"
@@
- "When a git operation fails: stop, diagnose, report, wait for the user.\n\n"
+ "When a git operation fails: stop, diagnose, report, wait for the user.\n\n"
)

📝 Committable suggestion
Current:

    "5. **NEVER run destructive operations without a backup** — before "
    "`git reset --hard`, `git clean -fd`, or `git checkout -- .`, "
    "create a backup branch first.\n"
    "6. **NEVER embed tokens in commands** — use environment variables.\n\n"
    "When a git operation fails: stop, diagnose, report, wait for the user.\n\n"
)

Suggested:

    "5. **NEVER run destructive operations without explicit user approval** — "
    "before `git reset --hard`, `git clean -fd`, or `git checkout -- .`, "
    "ask the user for confirmation and create a backup branch first.\n"
    "6. **NEVER embed tokens in commands** — use environment variables.\n\n"
    "When a git operation fails: stop, diagnose, report, wait for the user.\n\n"
)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/runners/ambient-runner/ambient_runner/platform/prompts.py` around
lines 82 - 87, Update the prompt text that currently contains the guidance lines
starting with "5. **NEVER run destructive operations without a backup**" and "6.
**NEVER embed tokens in commands**" to require explicit user confirmation before
performing destructive local git commands (`git reset --hard`, `git clean -fd`,
`git checkout -- .`): add a clear step requiring the operator to request and
receive a literal, explicit confirmation from the user (e.g., user must type "I
CONFIRM" or similar) before the runner will output or execute any destructive
command, and include the requirement that a named backup branch be created
first; modify the prompt string in
components/runners/ambient-runner/ambient_runner/platform/prompts.py so the
guidance enforces an explicit confirmation handshake for the identified
commands.
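The confirmation handshake requested in this comment could look something like the sketch below. Note that the PR itself relies on prompt text rather than enforcement code, so this is only an illustration of the logic: `is_destructive`, `may_run`, and `CONFIRMATION_PHRASE` are hypothetical names, and the prefix match is deliberately naive (it would miss reordered flags such as `git reset HEAD --hard`).

```python
import shlex
from typing import Optional

# Destructive invocations named in the review comment, as token prefixes.
DESTRUCTIVE_PREFIXES = (
    ("git", "reset", "--hard"),
    ("git", "clean", "-fd"),
    ("git", "checkout", "--", "."),
)
CONFIRMATION_PHRASE = "I CONFIRM"  # example phrase taken from the comment


def is_destructive(command: str) -> bool:
    """Return True if the command starts with a listed destructive prefix."""
    tokens = tuple(shlex.split(command))
    return any(tokens[: len(prefix)] == prefix for prefix in DESTRUCTIVE_PREFIXES)


def may_run(
    command: str, user_reply: Optional[str], backup_branch: Optional[str]
) -> bool:
    """Allow destructive commands only after confirmation and a named backup."""
    if not is_destructive(command):
        return True
    confirmed = user_reply is not None and user_reply.strip() == CONFIRMATION_PHRASE
    return confirmed and bool(backup_branch)
```

Requiring both the literal phrase and a backup branch name mirrors the two conditions in the comment; either one alone is not enough to proceed.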
    def test_prompt_includes_git_safety_with_repos(self):
        """Git safety guardrails are included when repos are present."""
        repos_cfg = [
            {
                "name": "my-repo",
                "url": "https://github.com/owner/my-repo.git",
                "branch": "main",
                "autoPush": False,
            }
        ]
        prompt = build_workspace_context_prompt(
            repos_cfg=repos_cfg,
            workflow_name=None,
            artifacts_path="artifacts",
            ambient_config={},
            workspace_path="/workspace",
        )
        assert "Git Safety Guardrails" in prompt
        assert "NEVER force push" in prompt

    def test_prompt_excludes_git_safety_without_repos(self):
        """Git safety guardrails are excluded when no repos are present."""
        prompt = build_workspace_context_prompt(
            repos_cfg=[],
            workflow_name=None,
            artifacts_path="artifacts",
            ambient_config={},
            workspace_path="/workspace",
        )
        assert "Git Safety Guardrails" not in prompt
Strengthen guardrail assertions to cover all safety-critical rules
Lines 337 and 338 lock in only one rule (`NEVER force push`), so regressions in the ref-deletion, API-ref, token, or main-branch protections would go unnoticed while the tests still pass.
Suggested test hardening
def test_prompt_includes_git_safety_with_repos(self):
"""Git safety guardrails are included when repos are present."""
@@
assert "Git Safety Guardrails" in prompt
assert "NEVER force push" in prompt
+ assert "NEVER delete remote branches or refs" in prompt
+ assert "NEVER manipulate git refs via the GitHub/GitLab REST API" in prompt
+ assert "NEVER push to main/master" in prompt
+ assert "NEVER run destructive operations without a backup" in prompt
+ assert "NEVER embed tokens in commands" in prompt

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/runners/ambient-runner/tests/test_auto_push.py` around lines 320 -
350, The test only asserts the presence of a single safety phrase ("NEVER force
push") in test_prompt_includes_git_safety_with_repos, which lets other
guardrails regress unnoticed; update the tests that call
build_workspace_context_prompt (e.g., test_prompt_includes_git_safety_with_repos
and test_prompt_excludes_git_safety_without_repos) to assert all expected
safety-critical phrases are present when repos_cfg is non-empty (examples:
"NEVER force push", "DO NOT delete refs", "DO NOT expose API keys", "RESTRICT
tokens", "PROTECT main branch" or whatever exact phrases
build_workspace_context_prompt emits) and assert none of those phrases appear
when repos_cfg is empty; locate the checks by referencing
build_workspace_context_prompt and the two test functions and add multiple
explicit assertions covering each guardrail phrase rather than relying on a
single substring.
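One way to cover every guardrail without repeating assertions is to keep the expected phrases in a single list and check them all in both directions. The sketch below is illustrative only: the phrases are copied from the suggested assertions above, while `fake_build_prompt`, `missing_phrases`, and `present_phrases` are hypothetical helpers, with the fake builder standing in for `build_workspace_context_prompt` so that the example is self-contained.

```python
# Phrases copied from the suggested assertions; the builder is a stand-in.
GUARDRAIL_PHRASES = [
    "NEVER force push",
    "NEVER delete remote branches or refs",
    "NEVER manipulate git refs via the GitHub/GitLab REST API",
    "NEVER push to main/master",
    "NEVER run destructive operations without a backup",
    "NEVER embed tokens in commands",
]


def fake_build_prompt(repos_cfg):
    """Hypothetical stand-in for build_workspace_context_prompt."""
    prompt = "Workspace context\n"
    if repos_cfg:
        prompt += "Git Safety Guardrails\n"
        prompt += "".join(f"- {phrase}\n" for phrase in GUARDRAIL_PHRASES)
    return prompt


def missing_phrases(prompt):
    """Guardrail phrases that should be in the prompt but are not."""
    return [phrase for phrase in GUARDRAIL_PHRASES if phrase not in prompt]


def present_phrases(prompt):
    """Guardrail phrases that appear in the prompt."""
    return [phrase for phrase in GUARDRAIL_PHRASES if phrase in prompt]
```

In a test, asserting `missing_phrases(prompt) == []` with repos configured and `present_phrases(prompt) == []` without them would name every regressed rule in a single failure message rather than stopping at the first.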
Keep only token redaction and escalation protocol; these are universally correct. Remove opinionated rules (force push policy, backup branches, ref deletion) that should be opt-in per-project.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
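The commit does not show the retained prompt text, so the following is only a guess at what "token redaction" might mean in practice: stripping credentials from a URL before it is echoed to the user or written to a log. `redact_url_credentials` is a hypothetical helper, not part of the runner, and it ignores edge cases such as IPv6 hosts.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url_credentials(url: str) -> str:
    """Remove any user:password@ userinfo from a URL before it is displayed."""
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        return url  # nothing to redact
    # Rebuild the network location from host (and port) only.
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
```

For instance, a clone URL with an embedded access token would be reduced to its scheme, host, and path, which is usually enough for a diagnostic message.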
Summary
Closes #1111
Supersedes #1225
Test plan
uv run pytest tests/test_auto_push.py

🤖 Generated with Claude Code